In this paper, we show how information geometry, the natural geometry of discrete probability
distributions, can be used to derive the quantum formalism. The derivation rests upon three ele- 
mentary features of quantum phenomena, namely complementarity, measurement simulability, and 
global gauge invariance. When these features are appropriately formalized within an information ge- 
ometric framework, and combined with a novel information-theoretic principle, the central features 
of the finite-dimensional quantum formalism can be reconstructed.

The unparalleled empirical success of quantum theory
strongly suggests that it accurately captures fundamen- 
tal aspects of the workings of the physical world. The 
clear articulation of these aspects is of inestimable value 
not only for the deeper understanding of quantum theory
in itself [1], but for its modification (for example, to
allow non-unitary continuous transformations [Mlj) and
its further development, particularly for the development 
of a theory of quantum gravity (see [5], for example). 
However, such articulation has traditionally been ham- 
pered by the fact that the quantum formalism, in which 
these aspects are presumably encoded, consists of postu- 
lates expressed in an abstract mathematical language to 
which our physical intuition cannot directly relate. Over 
the last two decades, there has been growing interest in 
elucidating these aspects by expressing, in a less abstract 
mathematical language, what quantum theory might be 
telling us about how nature works, and trying to derive, 
or reconstruct, quantum theory on this basis [TJ [Sl410| . 

Much of the recent effort in reconstructing the quan- 
tum formalism is motivated by the hypothesis that the 
concept of information might be the key, hitherto miss- 
ing, ingredient, that may enable a reconstruction, and 
several attempts have been made to systematically ex- 
plore the reconstruction of the quantum formalism from 
an informational starting point (for example [3 [TTI - 
[18]). Although these approaches have yielded signif- 
icant insights, they are either incomplete (for exam- 
ple, [TTJ [121 [2]) or employ abstract assumptions that 
involve the assumption of the complex number field (for 
example, [16-18]). Such assumptions significantly limit
the degree to which the physical content of the quantum 
formalism can be elucidated since one of the most mys- 
terious mathematical features of the quantum formalism 
is being assumed at the outset. In this paper, we show 
that the principal mathematical features of quantum the- 
ory can be reconstructed using the concept of information 
without employing such assumptions. 

Our approach develops intimate connections, known to 
exist for some time, between structures that arise naturally
in classical probability theory on the one hand, and
the quantum formalism for pure states on the other [19]-
[22]. For example, Wootters [19] has shown in the framework
of classical probability theory that one can quantify
the degree to which two discrete probability distributions,
$\mathbf{p} = (p_1, \dots, p_N)$ and $\mathbf{p}' = (p'_1, \dots, p'_N)$, can be
distinguished given the same number of samples from
each by means of the statistical distance, $d_s(\mathbf{p}, \mathbf{p}') =
\cos^{-1}\left(\sum_i \sqrt{p_i p'_i}\right)$, between them. If one considers the
statistical distance, $d_s(\mathbf{p}, \mathbf{p}')$, between the probability
distributions $\mathbf{p}$ and $\mathbf{p}'$ which characterize the results of
projective measurement A when performed upon two
$N$-dimensional pure states $\mathbf{u}$ and $\mathbf{v}$, respectively, and
if one chooses A such that $d_s$ is maximized, Wootters
shows that $d_s$ is equal to the Hilbert space distance,
$d_H(\mathbf{u}, \mathbf{v}) = \cos^{-1}|\mathbf{u}^\dagger \mathbf{v}|$, between $\mathbf{u}$ and $\mathbf{v}$ [19]. The
existence of such a connection is remarkable, and suggests
that the usual formalism of quantum theory might
owe at least some of its structure to the notion of dis- 
tinguishability that arises naturally in a purely classical 
probabilistic setting. 
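
The connection described above can be checked numerically in the simplest setting. The following is a minimal sketch, not Wootters' own construction: it assumes real two-dimensional unit vectors specified by polar angles, and it replaces the maximization over measurements by a plain grid scan over rotated measurement bases. The function names and the grid size are illustrative choices.

```python
import math

def statistical_distance(p, q):
    """Statistical distance: arccos of the sum over i of sqrt(p_i * q_i)."""
    overlap = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.acos(min(1.0, overlap))  # clamp guards against rounding above 1

def outcome_probs(angle, basis_angle):
    """Outcome probabilities for a real 2-d unit vector measured in a rotated basis."""
    rel = angle - basis_angle
    return (math.cos(rel) ** 2, math.sin(rel) ** 2)

def max_statistical_distance(a, b, steps=2000):
    """Scan measurement bases on a grid and return the largest statistical distance."""
    best = 0.0
    for k in range(steps):
        t = math.pi * k / steps
        d = statistical_distance(outcome_probs(a, t), outcome_probs(b, t))
        best = max(best, d)
    return best

a, b = 0.3, 1.0                              # polar angles of the states u and v
hilbert = math.acos(abs(math.cos(a - b)))    # Hilbert space distance arccos|u.v|
print(max_statistical_distance(a, b), hilbert)
```

For a poorly chosen basis the statistical distance falls below the Hilbert space distance; in this sketch the two agree only at the maximum over the scanned bases.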

Following Wootters, we adopt an operational approach,
and so take the probabilistic nature of measurements
as a given. Accordingly, the framework of classical
probability theory is taken as a starting point. We
equip this framework with a metric, $ds^2 = \frac{1}{4}\sum_i dp_i^2/p_i$,
the information metric (or Fisher-Rao metric), the infinitesimal
form of the statistical distance, rather than
the statistical distance itself, as this suffices for the purposes
of the reconstruction. This metric determines the
distance between infinitesimally close probability distributions
$\mathbf{p} = (p_1, \dots, p_N)$ and $\mathbf{p}' = (p'_1, \dots, p'_N)$. As we
shall describe below, the information metric can be un- 
derstood as a natural consequence of the introduction of 
the concept of information into the probabilistic frame- 
work. Accordingly, we shall refer to this framework as 
the information geometric framework [23].

Within this framework, we formalize three elementary 
features of quantum phenomena, namely complementar- 
ity, global gauge invariance, and measurement simulabil- 
ity, detailed below. These features can be understood as 
assertions about the physical world quite apart from the 
setting of the quantum formalism within which they are
usually encountered [Uj , and are sufficiently simple to be
taken as primitives in the building up of quantum theory. 
To these features, we add an information-theoretic princi- 
ple, the principle of metric invariance. From these ingre- 
dients, we reconstruct the principal features of the finite- 
dimensional quantum formalism, namely that pure states 
are represented by complex vectors, physical transforma- 
tions are represented by unitary or antiunitary transfor- 
mations, and the outcome probabilities (and the corre- 
sponding output states) of measurements are given by 
the Born rule. The present paper provides a stream- 
lined derivation of the key parts of the finite-dimensional 
quantum formalism, focussing on the essential ideas. The 
reader is referred to Refs. [MIES] for a more detailed dis- 
cussion of the underlying ideas and methodology, as well 
as a derivation of the remainder of the finite-dimensional 
quantum formalism. 



I. INFORMATION METRIC. 

We begin by giving a simple argument which shows 
how the information metric arises in a classical proba- 
bilistic setting from the concept of information. Suppose 
that Alice has two coins, A and B, characterized by the 
probability distributions $\mathbf{p} = (p_1, p_2)$ and $\mathbf{p}' = (p'_1, p'_2)$,
respectively. Suppose that she chooses coin A, tosses it n 
times, and then sends the data to Bob, without disclos- 
ing to him which coin she chose. If Bob knows p and p', 
how much information does the data provide him about 
which coin was tossed? Intuitively, the more information 
the data provides, the more sharply the distributions are 
distinguished. 

Using Bayes' theorem and Stirling's approximation for 
the case where n is large, on the assumption that coins A 
and B are a priori equally likely to be chosen, one finds 
that 

$$\frac{P_A}{P_B} = \exp\left(n \sum_i p_i \ln \frac{p_i}{p'_i}\right), \tag{1}$$

where $P_A$ is the probability that the tossed coin is A given
the data, and likewise for $P_B$ [55]. When the probability
distributions are close, so that $\mathbf{p}' = \mathbf{p} + d\mathbf{p}$, the argument
of the exponent can be expanded in the $dp_i$ to give

$$\frac{P_A}{P_B} = \exp\left(2n\,ds^2\right), \tag{2}$$

where $ds^2 = \frac{1}{4}\sum_i dp_i^2/p_i$ is the information metric.
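
The step from Eq. (1) to Eq. (2) can be illustrated with a short numerical check. This is a sketch under stated assumptions: the data are taken to be typical for coin A (outcome $i$ occurs $n p_i$ times), the priors are equal, and the particular values of the distributions and of $n$ are hypothetical.

```python
import math

def log_posterior_odds(p, p_prime, n):
    """ln(P_A / P_B) for typical data from coin A (counts n * p_i) and equal priors."""
    return n * sum(pi * math.log(pi / qi) for pi, qi in zip(p, p_prime))

def information_metric(p, dp):
    """ds^2 = (1/4) * sum_i dp_i^2 / p_i."""
    return 0.25 * sum(d * d / pi for pi, d in zip(p, dp))

p = (0.6, 0.4)            # hypothetical coin A
dp = (0.002, -0.002)      # small displacement, components sum to zero
p_prime = tuple(pi + d for pi, d in zip(p, dp))
n = 10000

exact = log_posterior_odds(p, p_prime, n)        # exponent in Eq. (1)
approx = 2 * n * information_metric(p, dp)       # exponent in Eq. (2)
print(exact, approx)
```

The two exponents differ only by terms of third order in the $dp_i$, so the agreement improves as the distributions are brought closer together.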

Now, the information gained by Bob, $\Delta I$, is the reduction
in his uncertainty, and is therefore defined as

$$\Delta I = U(1/2, 1/2) - U(P_A, P_B), \tag{3}$$

with $U$ being an entropy (uncertainty) function such as
the Shannon entropy. But, since $P_A + P_B = 1$ and $P_A/P_B$
is determined by $ds$, once $U$ is selected, $\Delta I$ is determined
by $ds$. For example, if $U$ is chosen to be the Shannon
entropy $U(\pi_1, \pi_2) = -\sum_i \pi_i \ln \pi_i$, one finds that

$$\Delta I = \frac{1}{2}\left(n\,ds^2\right)^2. \tag{4}$$

This result immediately generalizes to the case 
where $\mathbf{p}$ and $\mathbf{p}'$ are $M$-dimensional probability distributions
($M > 2$). Hence, from an informational viewpoint,
it is natural to endow the space of discrete probability 
distributions with the information metric. 
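
As an illustration of Eqs. (3) and (4), the sketch below evaluates the Shannon information gain directly from the posterior odds of Eq. (2) and compares it with the closed form of Eq. (4). It assumes natural logarithms and the regime $n\,ds^2 \ll 1$, where the expansion behind Eq. (4) is expected to hold; the numerical values are hypothetical.

```python
import math

def shannon(probs):
    """Shannon entropy in nats, U = -sum_i pi_i ln pi_i."""
    return -sum(x * math.log(x) for x in probs if x > 0)

def information_gain(n, ds2):
    """Delta I of Eq. (3), with P_A / P_B taken from Eq. (2)."""
    odds = math.exp(2 * n * ds2)
    p_a = odds / (1 + odds)
    return shannon((0.5, 0.5)) - shannon((p_a, 1 - p_a))

n, ds2 = 1000, 1e-5                  # hypothetical values with n * ds^2 = 0.01
exact = information_gain(n, ds2)
approx = 0.5 * (n * ds2) ** 2        # Eq. (4)
print(exact, approx)
```

For larger values of $n\,ds^2$ the direct evaluation departs from the quadratic form, which is why the check is restricted to small values here.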

Parenthetically, we remark that Wootters' statistical 
distance, $d_s(\mathbf{p}, \mathbf{p}') = \cos^{-1}\left(\sum_i \sqrt{p_i p'_i}\right)$, between the
probability distributions $\mathbf{p}$ and $\mathbf{p}'$ is the minimum distance
between $\mathbf{p}$ and $\mathbf{p}'$ with respect to the information
metric [SC. We do not, however, make use of this result
in what follows. 



II. DERIVATION 

A. Construction of State Space. 

Measurement is idealized as a process that (i) when 
performed upon some physical system, yields one of N 
possible outcomes, with probabilities, $p_1, \dots, p_N$, that
are determined by the state of the system immediately 
prior to the measurement, and (ii) is reproducible, so 
that, upon immediate repetition of the measurement, the 
same outcome is obtained with certainty. 

1. Formalizing Complementarity. 

We take the first feature, complementarity, to con- 
sist of the general idea that, when a measurement is 
performed upon a system in some state, the measure- 
ment outcome only yields information about half of the 
experimentally-accessible degrees of freedom of the state. 
In the above classical probabilistic model of measure- 
ment, we can express this idea in a very simple way as 
follows: 

Postulate 1. Complementarity. When measurement
A is performed, one of $2N$ possible events
occurs, but they are not individually observed. Outcome
$i$ is observed ($i = 1, \dots, N$) whenever either
event $2i - 1$ or event $2i$ is realized. The
events $1, \dots, 2N$ are assumed to occur with probabilities
$P_1, \dots, P_{2N}$, respectively, so that

$$p_i = P_{2i-1} + P_{2i}, \tag{5}$$

where $p_i$ is the probability of outcome $i$.

The $P_q$ ($q = 1, \dots, 2N$) can be summarized by the
probability n-tuple $\mathbf{P} = (P_1, \dots, P_{2N})$. As a result, of
the $2N - 1$ degrees of freedom of $\mathbf{P}$, the measurement
outcome only yields information about the $p_i$, which constitute
$N - 1$ degrees of freedom. We shall shortly impose
an additional constraint (global gauge invariance)
which implies that only $2(N - 1)$ of the $2N - 1$ degrees
of freedom of P are physically relevant. Hence, the mea- 
surement yields information about exactly one half of the 
experimentally-accessible degrees of freedom in P. 

Intuitively, performing the measurement brings about 
the realization of one of 2N possible events but the ob- 
served outcomes coarse-grain over these events; when 
event $2i - 1$ or $2i$ occurs, the measurement is (for some
reason to be investigated) unable to resolve the individ- 
ual events, so that only outcome i is registered. This is a 
novel hypothesis, which, at this point in the derivation, is 
recommended by its simplicity, and remains to be judged 
by its explanatory power (namely its capacity to support 
a derivation of the quantum formalism) [31].

2. Imposing the Information Metric. 

Next, we endow the space of probability distributions
$\mathbf{P}$ with the information metric, $ds^2 = \frac{1}{4}\sum_q dP_q^2/P_q$,
where $q = 1, \dots, 2N$. It is convenient to define $Q_q =
\sqrt{P_q}$, where $Q_q \in [0, 1]$, since the metric over the $Q_q$
is then simply the Euclidean metric, $ds^2 = dQ_1^2 + \dots +
dQ_{2N}^2$, so that $\mathbf{Q} = (Q_1, Q_2, \dots, Q_{2N})^T$ is a unit vector
that lies on the positive orthant of the unit hypersphere
$S^{2N-1}$ in a $2N$-dimensional Euclidean space.
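
The change of variables above can be illustrated with a small numerical sketch. The event probabilities and the displacement used here are hypothetical, and the comparison between the two metrics is a finite-difference check rather than a proof.

```python
import math

def coarse_grain(P):
    """Outcome probabilities p_i = P_{2i-1} + P_{2i} of Postulate 1."""
    return [P[k] + P[k + 1] for k in range(0, len(P), 2)]

def to_Q(P):
    """Square-root coordinates Q_q = sqrt(P_q)."""
    return [math.sqrt(x) for x in P]

P = [0.1, 0.2, 0.3, 0.4]            # N = 2 outcomes, 2N = 4 unobserved events
dP = [1e-4, -2e-4, 3e-4, -2e-4]     # small displacement, components sum to zero
P_near = [a + b for a, b in zip(P, dP)]

Q, Q_near = to_Q(P), to_Q(P_near)
norm_sq = sum(q * q for q in Q)                             # Q is a unit vector
fisher = 0.25 * sum(d * d / x for x, d in zip(P, dP))       # information metric
euclid = sum((b - a) ** 2 for a, b in zip(Q, Q_near))       # Euclidean metric in Q
print(coarse_grain(P), norm_sq, fisher, euclid)
```

The two squared distances agree up to corrections that vanish with the size of the displacement.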

3. Representing Physical Transformations. 

We now consider transformations of state space which 
represent physical transformations of the system. We 
postulate that transformations of the state space, assumed
one-to-one, preserve the metric over state space;
that is, the information distance, $d(\mathbf{Q}, \mathbf{Q}')$, between
any pair of infinitesimally close states, $\mathbf{Q}$, $\mathbf{Q}'$, where $d(\cdot,\cdot)$
denotes distance with respect to the metric over state
space, is preserved. The essential idea here is that the
discriminability of any pair of nearby states is a quantity
that is intrinsic to this pair of states, and therefore
should remain invariant under reversible and deterministic
transformations of the system [32].

Now, if one takes the Q themselves as the state space of 
the system, one immediately finds that continuous one- 
to-one transformations of the state space that preserve 
the information metric are not possible. A simple way 
to allow the existence of such transformations is to take 
the entire unit hypersphere, $S^{2N-1}$, as the state space
of the system. That is, we take the state of the system
as being given by a unit vector $\mathbf{Q} = (Q_1, Q_2, \dots, Q_{2N})^T$,
with $Q_q \in [-1, 1]$, where the probabilities $P_q$ are given
by $P_q = Q_q^2$. From the information metric over the $\mathbf{P}$, it
follows from the relation $P_q = Q_q^2$ that the metric over
the $\mathbf{Q}$ is Euclidean,

$$ds^2 = dQ_1^2 + dQ_2^2 + \dots + dQ_{2N}^2. \tag{6}$$

We can summarize the above requirements as follows: 



Postulate 2. Metric Invariance. The state
of the system is given by the unit vector $\mathbf{Q} =
(Q_1, Q_2, \dots, Q_{2N})^T$, with $Q_q \in [-1, 1]$, where the
probabilities $P_q$ are given by $P_q = Q_q^2$. The metric
over the $\mathbf{Q}$ is Euclidean, $ds^2 = dQ_1^2 + dQ_2^2 + \dots +
dQ_{2N}^2$, which any transformation, $\mathcal{M}$, of state space
must preserve.

It follows from this postulate that $\mathbf{Q}$ lies on the unit hypersphere,
$S^{2N-1}$, in a $2N$-dimensional real Euclidean
space. From the requirement of metric preservation, it
follows that $\mathcal{M}$ is an orthogonal transformation of $S^{2N-1}$,
so that every transformation can be expressed as $\mathbf{Q}' =
M\mathbf{Q}$, where $M$ is a $2N$-dimensional real orthogonal matrix.
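
A small sketch may help to show why the state space is extended beyond the positive orthant. It applies one particular orthogonal map, a plane rotation, to a state with hypothetical probabilities; the choice of rotation plane and angle is arbitrary and serves only as an example.

```python
import math

def rotate(Q, a, b, angle):
    """Apply a plane rotation (an orthogonal map) to components a and b of Q."""
    out = list(Q)
    c, s = math.cos(angle), math.sin(angle)
    out[a] = c * Q[a] - s * Q[b]
    out[b] = s * Q[a] + c * Q[b]
    return out

def distance(Q1, Q2):
    """Euclidean distance between two states."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(Q1, Q2)))

Q = [math.sqrt(x) for x in (0.1, 0.2, 0.3, 0.4)]
Q_near = [math.sqrt(x) for x in (0.1001, 0.1998, 0.3003, 0.3998)]

Qr, Qr_near = rotate(Q, 0, 3, 1.2), rotate(Q_near, 0, 3, 1.2)
probs = [q * q for q in Qr]          # P_q = Q_q^2 remains a probability distribution
print(Qr, sum(probs), distance(Q, Q_near), distance(Qr, Qr_near))
```

In this example the rotated state has a negative component, so it lies outside the positive orthant, while the norm and the distance between the two nearby states are unchanged.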

The above extension of the state space from the positive
orthant of $S^{2N-1}$ to the entire hypersphere is
an assumption which, although formally rather natural, 
presently awaits a clear physical basis. 

B. Global Gauge Invariance. 

The second feature, global gauge invariance, consists 
of the idea that one can find a representation of the state 
of a system such that, if one displaces a subset of the 
degrees of freedom of the state by the same amount, any 
physical predictions based on the state are left invari- 
ant. To formalize this feature, we begin by making a 
change of variables by expressing the state, Q, in terms 
of the probabilities $p_1, p_2, \dots, p_N$, and $N$ additional real
degrees of freedom, $\theta_1, \theta_2, \dots, \theta_N$, so that, without loss
of generality,

$$Q_{2i-1} = \sqrt{p_i}\cos\theta_i, \qquad Q_{2i} = \sqrt{p_i}\sin\theta_i. \tag{7}$$

Only the $\theta_i$ can be subject to displacement since a displacement
involving any of the $p_i$ would be experimentally
detectable. Accordingly, we formalize the idea of
global gauge invariance by requiring that $\theta_i = \theta(\chi_i)$,
where $\theta(\cdot)$ is an unknown, non-constant, differentiable
function to be determined, and that the transformation
$\chi_i \to \chi_i + \chi_0$ for $i = 1, \dots, N$ brings about no predictive
changes for any $\chi_0 \in \mathbb{R}$. From this global gauge
condition, we immediately draw the following postulate: 

Postulate 3. Gauge Invariance. The map $\mathcal{M}$
is such that, for any state $\mathbf{Q} \in S^{2N-1}$, the
probabilities, $p'_1, p'_2, \dots, p'_N$, of the outcomes of
measurement A performed upon a system in
state $\mathbf{Q}' = \mathcal{M}(\mathbf{Q})$ are unaffected if, in any representation,
$(p_i; \chi_i)$, of the state $\mathbf{Q}$, an arbitrary
real constant, $\chi_0$, is added to each of the $\chi_i$.

Additionally, we draw the requirement that the measure,
$\mu(p_i; \chi_i)$, over $p_1, \dots, p_N, \chi_1, \dots, \chi_N$ induced by the
metric over $S^{2N-1}$ is consistent with the global gauge
condition. This requirement is necessary in order that
probabilistic inference using the measure as a prior over
state space is consistent with our physical knowledge of 
the system. This requirement yields the following postu- 
late: 

Postulate 4. Measure Invariance. The measure
$\mu(p_i; \chi_i)$ induced by the metric over state space
satisfies the condition $\mu(p_1, \dots, p_N; \chi_1, \dots, \chi_N) =
\mu(p_1, \dots, p_N; \chi_1 + \chi_0, \dots, \chi_N + \chi_0)$ for any $\chi_0$.



1. Determining the function $\theta(\cdot)$.



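From Eqs. (5), (6), and (7),

$$ds^2 = \frac{1}{4}\sum_{i=1}^{N} \frac{dp_i^2}{p_i} + \sum_{i=1}^{N} p_i\,\theta'^2(\chi_i)\,d\chi_i^2. \tag{8}$$

The measure, $\mu(p_i; \chi_i)$, over $(p_1, \dots, p_N; \chi_1, \dots, \chi_N)$ induced
by this metric is proportional to the square-root of
the determinant of the metric, and marginalizes to give

$$\mu_i(\chi_i) = c\,|\theta'(\chi_i)| \tag{9}$$

as the measure over $\chi_i$, where $c$ is a constant.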
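
The form of the metric in Eq. (8) can be checked by finite differences. In the sketch below the function $\theta(\cdot)$ is a hypothetical, arbitrarily chosen differentiable function (at this stage of the derivation it is still undetermined), and the state and displacements are illustrative values.

```python
import math

def theta(chi):
    """Hypothetical non-constant differentiable choice for theta(.)."""
    return 2.0 * chi + math.sin(chi)

def theta_prime(chi):
    """Derivative of the hypothetical theta(.) above."""
    return 2.0 + math.cos(chi)

def to_Q(p, chi):
    """Real state vector with Q_{2i-1} = sqrt(p_i) cos(theta_i), Q_{2i} = sqrt(p_i) sin(theta_i)."""
    Q = []
    for pi, ci in zip(p, chi):
        Q += [math.sqrt(pi) * math.cos(theta(ci)), math.sqrt(pi) * math.sin(theta(ci))]
    return Q

p, chi = [0.3, 0.7], [0.4, 1.1]
dp, dchi = [1e-5, -1e-5], [2e-5, -3e-5]

Q1 = to_Q(p, chi)
Q2 = to_Q([a + b for a, b in zip(p, dp)], [a + b for a, b in zip(chi, dchi)])

euclid = sum((b - a) ** 2 for a, b in zip(Q1, Q2))     # Euclidean ds^2 in the Q
eq8 = (0.25 * sum(d * d / pi for pi, d in zip(p, dp))
       + sum(pi * theta_prime(ci) ** 2 * d * d for pi, ci, d in zip(p, chi, dchi)))
print(euclid, eq8)
```

Because the metric is diagonal in these coordinates, the square-root of its determinant contains one factor of $|\theta'(\chi_i)|$ for each $i$.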

Now, from the Measure Invariance postulate, it follows
by marginalization that the measure $\mu_i(\chi_i)$ satisfies the
relation $\mu_i(\chi_i + \chi_0) = \mu_i(\chi_i)$ for all $\chi_0$, and is therefore
independent of $\chi_i$. Hence, from Eq. (9), $\theta(\chi) = a\chi + b$,
where $a$, $b$ are constants, with $a \neq 0$ since, by assumption,
the function $\theta(\cdot)$ is not constant. We can therefore
write 



$$\mathbf{Q} = \left(\sqrt{p_1}\cos\theta_1, \sqrt{p_1}\sin\theta_1, \dots, \sqrt{p_N}\cos\theta_N, \sqrt{p_N}\sin\theta_N\right)^T. \tag{10}$$



2. Implementing Gauge Invariance, and the emergence of 
Complex Vector Space. 



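From Eq. (10), the Gauge Invariance postulate, and
the relation $\theta_i = a\chi_i + b$ given above, one can show
that $M$ is restricted to one of two types: $M$ has the general
form

$$M = \begin{pmatrix} T^{(11)} & T^{(12)} & \cdots & T^{(1N)} \\ T^{(21)} & T^{(22)} & \cdots & T^{(2N)} \\ \vdots & \vdots & \ddots & \vdots \\ T^{(N1)} & T^{(N2)} & \cdots & T^{(NN)} \end{pmatrix}, \tag{11}$$

where $T^{(ij)}$ has the form

$$T^{(ij)} = \alpha_{ij} \begin{pmatrix} \cos\varphi_{ij} & -\sin\varphi_{ij} \\ \sin\varphi_{ij} & \cos\varphi_{ij} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & (-1)^{\beta} \end{pmatrix},$$

and where either $\beta = 0$ (type 1), in which case $T^{(ij)}$
is a scale-rotation matrix, or $\beta = 1$ (type 2), in which
case $T^{(ij)}$ is a scale-rotation-reflection matrix, with scale
factor $\alpha_{ij}$ and rotation angle $\varphi_{ij}$ in either case [33].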
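
The restriction to the two block types can be illustrated numerically for $N = 2$. The sketch below is only a spot check at one state, not a derivation: it assumes blocks of the form scale factor times rotation times $\mathrm{diag}(1, (-1)^\beta)$, takes the scale factors and rotation angles from a hypothetical $2 \times 2$ unitary matrix so that $M$ is orthogonal, and compares with an orthogonal matrix that is not of this block form. A common shift of the $\chi_i$ is represented by a common shift of the $\theta_i$, as follows from $\theta_i = a\chi_i + b$.

```python
import cmath
import math

def state(p, theta):
    """Real state vector Q of Eq. (10)."""
    Q = []
    for pi, ti in zip(p, theta):
        Q += [math.sqrt(pi) * math.cos(ti), math.sqrt(pi) * math.sin(ti)]
    return Q

def apply(M, Q):
    """Matrix-vector product Q' = M Q, with M given as a list of rows."""
    return [sum(m * q for m, q in zip(row, Q)) for row in M]

def outcome_probs(Q):
    """Outcome probabilities p_i = Q_{2i-1}^2 + Q_{2i}^2."""
    return [Q[k] ** 2 + Q[k + 1] ** 2 for k in range(0, len(Q), 2)]

def block_matrix(U, beta):
    """Build M from blocks alpha*R(phi)*diag(1, (-1)^beta), where alpha*e^{i phi} = U[i][j]."""
    n, sign = len(U), (-1) ** beta
    M = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            z = U[i][j]
            M[2 * i][2 * j], M[2 * i][2 * j + 1] = z.real, -sign * z.imag
            M[2 * i + 1][2 * j], M[2 * i + 1][2 * j + 1] = z.imag, sign * z.real
    return M

# Hypothetical unitary supplying the scale factors and rotation angles.
a = math.cos(0.5) * cmath.exp(0.3j)
b = math.sin(0.5) * cmath.exp(1.1j)
U = [[a, -b.conjugate()], [b, a.conjugate()]]

# Orthogonal, but not of the block form: a rotation mixing Q_1 and Q_3 only.
c, s = math.cos(0.5), math.sin(0.5)
M_generic = [[c, 0, -s, 0], [0, 1, 0, 0], [s, 0, c, 0], [0, 0, 0, 1]]

def max_change(M, shift=0.8):
    """Largest change in an outcome probability when every theta_i is shifted equally."""
    p, theta = [0.3, 0.7], [0.4, 1.1]
    before = outcome_probs(apply(M, state(p, theta)))
    after = outcome_probs(apply(M, state(p, [t + shift for t in theta])))
    return max(abs(x - y) for x, y in zip(before, after))

print(max_change(block_matrix(U, 0)), max_change(block_matrix(U, 1)), max_change(M_generic))
```

In this example both block types leave the outcome probabilities unchanged under the common shift, whereas the generic orthogonal matrix does not.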



Now, the state $\mathbf{Q}$ can be faithfully represented by the
complex unit vector

$$\mathbf{v} = (Q_1 + iQ_2, \dots, Q_{2N-1} + iQ_{2N})^T, \tag{12}$$



and, remarkably, one can then show that the transformations
$M$ of type 1 correspond one-to-one with the
unitary transformations of $\mathbf{v}$, and that the transformations
$M$ of type 2 correspond one-to-one with the
antiunitary transformations of $\mathbf{v}$. In particular, on
the assumption that a parameterized transformation that 
represents a continuous physical transformation must re- 
duce to the identity for some value of the parameters, it 
follows that a continuous transformation must be repre- 
sented by unitary transformations. 
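
The correspondence with unitary and antiunitary transformations can be illustrated for $N = 2$. This is a sketch at a single state with a hypothetical unitary matrix $U$; it assumes the block form scale factor times rotation times $\mathrm{diag}(1, (-1)^\beta)$, with the scale factor and rotation angle of block $(i, j)$ read off from the modulus and phase of $U_{ij}$.

```python
import cmath
import math

def to_complex(Q):
    """Eq. (12): v_i = Q_{2i-1} + i Q_{2i}."""
    return [complex(Q[k], Q[k + 1]) for k in range(0, len(Q), 2)]

def apply(M, x):
    """Matrix-vector product, with M given as a list of rows."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def block_matrix(U, beta):
    """Build M from blocks alpha*R(phi)*diag(1, (-1)^beta), where alpha*e^{i phi} = U[i][j]."""
    n, sign = len(U), (-1) ** beta
    M = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            z = U[i][j]
            M[2 * i][2 * j], M[2 * i][2 * j + 1] = z.real, -sign * z.imag
            M[2 * i + 1][2 * j], M[2 * i + 1][2 * j + 1] = z.imag, sign * z.real
    return M

def max_diff(x, y):
    """Largest componentwise difference between two complex vectors."""
    return max(abs(a - b) for a, b in zip(x, y))

# Hypothetical 2x2 unitary matrix.
a = math.cos(0.5) * cmath.exp(0.3j)
b = math.sin(0.5) * cmath.exp(1.1j)
U = [[a, -b.conjugate()], [b, a.conjugate()]]

# Real state Q for probabilities (0.3, 0.7) and phases (0.4, 1.1), and its complex form v.
Q = []
for pi, ti in zip([0.3, 0.7], [0.4, 1.1]):
    Q += [math.sqrt(pi) * math.cos(ti), math.sqrt(pi) * math.sin(ti)]
v = to_complex(Q)

type1 = to_complex(apply(block_matrix(U, 0), Q))   # compare with U v
type2 = to_complex(apply(block_matrix(U, 1), Q))   # compare with U v*
err1 = max_diff(type1, apply(U, v))
err2 = max_diff(type2, apply(U, [x.conjugate() for x in v]))
print(err1, err2)
```

In this sketch the type 1 matrix acts on $\mathbf{v}$ as $U$, and the type 2 matrix acts as complex conjugation followed by $U$, which is an antiunitary map.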



C. Representation of Measurements. 



The third feature, measurement simulability, can be
stated as follows:



Postulate 5. Measurement Simulability. Any
reproducible measurement, A', describable in the
formalism can, insofar as its outcome probabilities 
and associated output states are concerned, be sim- 
ulated by an arrangement consisting of measure- 
ment A flanked by suitable interactions with the 
system. 

Given the results derived above, this postulate immediately
implies that A' can be simulated by the arrangement
shown in Fig. 1, where $U$ and $V$ are unitary transformations
representing the interactions with the system.

The reproducibility of measurement A implies that the
state of a system immediately after A has yielded outcome
$i$ is given by $\mathbf{v}_i = (0, \dots, e^{i\theta_i}, \dots, 0)^T$, where $\theta_i$
is undetermined. Hence, the input state $\mathbf{v}'_i = U^{-1}\mathbf{v}_i$
will yield outcome $i$. In order that the arrangement behave
like a reproducible measurement, the output state
must be $\mathbf{v}'_i$ up to an overall phase, so that it suffices
to choose $V\mathbf{v}_i = \mathbf{v}'_i$ for $i = 1, \dots, N$, which implies
that $V = U^{-1}$.




FIG. 1: Simulation of measurement A' in terms of measure- 
ment A. 

Since the $\mathbf{v}_i$ form an orthonormal basis, it follows
from $\mathbf{v}'_i = U^{-1}\mathbf{v}_i$ that the $\mathbf{v}'_i$ also form an orthonormal
basis. Therefore, any state $\mathbf{v}$ can be expanded
as $\sum_i c'_i \mathbf{v}'_i$, where $c'_i = \mathbf{v}'^{\dagger}_i \mathbf{v}$. With the input state $\mathbf{v}$,
the state measured by measurement A in the arrangement
is $U\mathbf{v} = \sum_i c'_i \mathbf{v}_i$. From Eq. (12), the probabilities,
$p_1, \dots, p_N$, of the outcomes of measurement A performed
on state $\mathbf{v} = (v_1, \dots, v_N)^T$ are given by $p_i = |v_i|^2$.
Therefore, in this case, the measurement yields outcome
$i$, together with output state $\mathbf{v}'_i$, with probability
$|c'_i|^2$, which is the Born rule.
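
The argument above can be followed numerically for $N = 2$. In this sketch the unitary $U$ and the input state are hypothetical, and the undetermined phases of the post-measurement states of A are set to zero for simplicity; the outcome probabilities of the simulated measurement are computed in two ways, from the expansion coefficients and from the components of $U\mathbf{v}$.

```python
import cmath
import math

def matvec(U, v):
    """Matrix-vector product, with U given as a list of rows."""
    return [sum(u * x for u, x in zip(row, v)) for row in U]

def dagger(U):
    """Conjugate transpose; for a unitary matrix this is its inverse."""
    n = len(U)
    return [[U[j][i].conjugate() for j in range(n)] for i in range(n)]

def inner(x, y):
    """Inner product x^dagger y."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

# Hypothetical interaction U and input state v.
a = math.cos(0.5) * cmath.exp(0.3j)
b = math.sin(0.5) * cmath.exp(1.1j)
U = [[a, -b.conjugate()], [b, a.conjugate()]]
v = [math.sqrt(0.3) * cmath.exp(0.4j), math.sqrt(0.7) * cmath.exp(1.1j)]

# Post-measurement states of A, with the undetermined phases set to zero.
basis = [[1 + 0j, 0j], [0j, 1 + 0j]]
v_prime = [matvec(dagger(U), e) for e in basis]    # states that yield outcome i with certainty
c_prime = [inner(vp, v) for vp in v_prime]         # expansion coefficients of v

born = [abs(c) ** 2 for c in c_prime]              # |c'_i|^2
simulated = [abs(x) ** 2 for x in matvec(U, v)]    # |(U v)_i|^2 measured by A
print(born, simulated)
```

A different choice of the undetermined phases would multiply each coefficient by a phase factor and leave the probabilities unchanged.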



III. DISCUSSION 



The physical irrelevance of the overall phase of a pure 
state is usually regarded as being a minor mathematical 
feature of the quantum formalism of little physical importance.
From this standpoint, one of the most surprising
findings in the derivation is that the global gauge condition
(which expresses in a more general way the physical
irrelevance of the overall phase) is sufficiently strong
as to transform a $2N$-dimensional real formalism (where
states are real unit vectors, and the transformations
are the orthogonal transformations) into the familiar $N$-dimensional
complex vector formalism of quantum theory
(where states are complex unit vectors, and the transformations
are the unitary and antiunitary transformations).
In particular, the fact that the set of possible
transformations one obtains is precisely the set of all uni- 
tary and antiunitary transformations (and neither more 
nor less) is not something that could, a priori, have been 
reasonably anticipated. 

The derivation provides a number of other important 
insights into the structure of the quantum formalism. 
From the perspective of the derivation, it is clear that 
the use of complex numbers in the quantum formalism is 
directly tied to the set of possible transformations of state 
space. For example, if the set of all orthogonal transformations
were allowed, then the complex form of the formalism,
whilst still possible to write down, would involve
non-linear continuous transformations and would therefore
not appear mathematically natural. The derivation
also suggests that information geometry is directly or indirectly
responsible for many of the formalism's key mathematical
features (such as the importance of square-roots of probability,
and the sinusoidal functions that appear in a quantum
state), thereby providing significant new support for
the hypothesis that information plays a fundamental role 
in determining the structure of quantum theory. 

Finally, the derivation illuminates a previous partial reconstruction
of quantum theory due to Stueckelberg [26].
Stueckelberg makes an assumption similar to the Complementarity
postulate to arrive at the idea that the state
of a system is given by a $2N$-dimensional probability distribution
which can be written as a unit vector in a $2N$-dimensional
'square-root of probability space', as we have
done. He then asserts that the allowable transformations 
of the state space are orthogonal transformations, and 
shows that, if the transformations are restricted by a su- 
perselection rule, then the set of restricted transforma- 
tions is equivalent to the set of unitary transformations 
acting on a suitably-defined $N$-dimensional complex state
space. The present derivation shows that Stueckelberg's 
assertion that the allowable transformations are orthog- 
onal transformations can be naturally accounted for in 
terms of the information metric over the probability sim- 
plex via the Metric Invariance postulate. The derivation 
also shows that Stueckelberg's superselection rule can be 
replaced by the Global Gauge Invariance postulate.